A Low-Resource ASR Back-End Based on Custom Arithmetic

Authors

  • Xiao Li
  • Jonathan Malkin
  • Jeff Bilmes
Abstract

Most contemporary ASR systems running on desktops use continuous-density HMMs (CHMM) with floating-point representations. It is important to reduce their memory and power requirements so that they can be more affordable for portable devices. In this paper, we propose a novel speech recognition back-end based on custom arithmetic, where all floating-point variables are represented by integer indices and all arithmetic operations are replaced by a sequence of table lookups. One critical issue associated with table lookups is what we call an accumulative variable whose dynamic range is either large or unpredictable. Such a variable would introduce much distortion if quantized to low precision, so that the table lookup would incur a great loss of information. We therefore explore different quantization structures dealing with this problem in likelihood evaluation, and present a normalization method for the Viterbi search to make the range of the forward probabilities predictable. Furthermore, we investigate several optimization algorithms on system-wide bit-width allocation. The best algorithm uses 80 Kbytes of tables to achieve all back-end operations with only a slight degradation in system performance. As a side effect, the offline storage for parameters is reduced by 80%, and the memory requirement for online computation is reduced by nearly 70%.

Keywords: speech recognition, low resource, quantization, normalization, optimization
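The idea of replacing arithmetic with table lookups can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform codebooks, the bit-widths (4-bit operands, 5-bit result), and the names `make_codebook`, `quantize`, and `build_table` are all assumptions chosen for illustration. Each real value is stored as the index of its nearest codeword, and a two-operand operation becomes a single lookup in a precomputed table of result indices.

```python
import bisect


def make_codebook(lo, hi, bits):
    """Uniform codebook of 2**bits values spanning [lo, hi] (assumed design)."""
    n = 1 << bits
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def quantize(x, codebook):
    """Return the index of the codeword nearest to x (codebook is sorted)."""
    pos = bisect.bisect_left(codebook, x)
    if pos == 0:
        return 0
    if pos == len(codebook):
        return len(codebook) - 1
    # Choose the closer of the two neighbouring codewords.
    if codebook[pos] - x < x - codebook[pos - 1]:
        return pos
    return pos - 1


def build_table(op, cb_x, cb_y, cb_out):
    """Precompute op for every pair of input codewords; store output indices."""
    return [[quantize(op(x, y), cb_out) for y in cb_y] for x in cb_x]


# Hypothetical bit-widths: 4-bit operands, 5-bit result.
cb_in = make_codebook(0.0, 15.0, 4)    # 16 codewords, step 1.0
cb_out = make_codebook(0.0, 31.0, 5)   # 32 codewords, step 1.0
add_table = build_table(lambda a, b: a + b, cb_in, cb_in, cb_out)

# At run time only integer indices are handled; the addition is one lookup.
i = quantize(3.2, cb_in)
j = quantize(4.9, cb_in)
k = add_table[i][j]
approx_sum = cb_out[k]  # decoded only here, to inspect the result
```

In this sketch the table for a 4-bit by 4-bit operation holds 256 entries, so table size grows quickly with operand bit-width. It also shows why a variable with a large or unpredictable range is problematic: a wider range at a fixed bit-width means coarser codewords and larger quantization error in every lookup that touches that variable.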


Related articles

Is speech enhancement pre-processing still relevant when using deep neural networks for acoustic modeling?

Using deep neural networks (DNNs) for automatic speech recognition (ASR) has recently attracted much attention due to the large performance improvement they provide for a variety of tasks. DNNs are known to be robust to overfitting and to be able to remove speaker variability. Another important cause of variability in speech is the presence of noise. A lot of research has been undertaken on noi...


Implementation of Low-Cost Architecture for Control an Active Front End Rectifier

In AC-DC power conversion, active front end rectifiers offer several advantages over diode rectifiers such as bidirectional power flow capability, sinusoidal input currents and controllable power factor. A digital finite control set model predictive controller based on fixed-point computations of an active front end rectifier with unity displacement of input voltage and current to improve dynam...


Effect of Selective Home-Based Exercises on Chronic Low Back Pain in Unilateral Below Knee Amputees

Aims: Different studies have shown different effects of exercise on low back pain. The aim of the present study was to assess the effect of selective home-based exercises on chronic low back pain in unilateral below knee amputees. Materials and Methods: In this randomized controlled clinical trial with pretest-posttest design conducted in 2016, 53 unilateral below knee amputees with chronic lo...


Uncertainty Decoding with Adaptive Sampling for Noise Robust DNN-Based Acoustic Modeling

Although deep neural network (DNN) based acoustic models have obtained remarkable results, the automatic speech recognition (ASR) performance still remains low in noise and reverberant conditions. To address this issue, a speech enhancement front-end is often used before recognition to reduce noise. However, the front-end cannot fully suppress noise and often introduces artifacts that are limit...


Linear Prediction-based Dereverberation with Advanced Speech Enhancement and Recognition Technologies for the Reverb Challenge

This paper describes systems for the enhancement and recognition of distant speech recorded in reverberant rooms. Our speech enhancement (SE) system handles reverberation with blind deconvolution using linear filtering estimated by exploiting the temporal correlation of observed reverberant speech signals. Additional noise reduction is then performed using an MVDR beamformer and advanced model-...




Publication date: 2003